Fruit Phenolic and Triterpenic Composition of Progenies of Olea europaea subsp. cuspidata, an Interesting Phytochemical Source to Be Included in Olive Breeding Programs

Olea europaea subsp. cuspidata has a relatively low commercial value due to the small size and low pulp-to-stone ratio of its drupes compared to commercial olive cultivars. Nevertheless, this subspecies could represent a valid source of useful traits for olive breeding. In the current work, the drupe metabolic composition (secoiridoids, flavonoids, simple phenols, triterpenic acids, etc.) of a progeny of 27 cuspidata genotypes coming from free pollination and their female parent was evaluated by applying a powerful LC-MS method. A total of 62 compounds were detected within the profiles; 60 of them were annotated and 27 quantified. From a quantitative point of view, the genotypes from the cuspidata progeny showed metabolic profiles quite different from those of the common olive cultivars (“Arbequina”, “Frantoio”, “Koroneiki” and “Picual”) used as controls. Cuspidata drupes were richer in several bioactive compounds such as rutin, hydroxytyrosol glucoside, a few interesting secoiridoids and the compounds of m/z 421 and 363. The relationships among several secondary metabolites determined in the progeny, inferred from the results of both PCA and cross-correlation analysis, were explained according to metabolic biosynthesis pathways in olive drupes. These outcomes underline the potential of cuspidata genetic resources as a source of potentially interesting variability in olive breeding programs.


Introduction
The genus Olea belongs to the family Oleaceae and is divided into three different subgenera: Olea, Tetrapilus and Paniculatae [1]. Six subspecies have been defined for Olea europaea L., which is popularly known as "The Olive Complex". The subsp. europaea (diploid), which can be found throughout the whole Mediterranean basin, is represented by two botanical varieties: cultivated olive (Olea europaea subsp. europaea var. europaea) and wild olive (Olea europaea subsp. europaea var. sylvestris). Additionally, five non-cultivated subspecies have been described: laperrinei (diploid), cuspidata (diploid), guanchica (diploid), maroccana (polyploid 6n) and cerasiformis (polyploid 4n) [2][3][4]. The geographical origin and domestication of the olive tree remain unclear. It is usually accepted that olive tree domestication began in the Northern Levant approximately six thousand years ago [5]. Different paleobotanic and genetic investigations have hypothesized that the current cultivars arose from one or multiple random hybridizations between wild and domesticated Mediterranean genotypes. Wild and cultivated olive trees have coexisted throughout human civilizations [4,6].
Nowadays, the cultivated olive tree is considered the most emblematic tree of the Mediterranean basin and is of undeniable economic importance. Spain tops the list of [...]
To date, there are only a few studies dealing with the characterization of olive oils obtained from wild olives of various origins (Pakistan, Tunisia, Algeria or Portugal) [34][35][36][37][38][39]. Dabbou and co-authors, for instance, observed that oleasters could be potentially interesting, since they produced oils with good quality characteristics in terms of minor compounds (phenols and volatiles) compared to the "Chemlali Sfax" cultivar [38]. Similarly, Bouarroudj and colleagues highlighted the high potential of Algerian oleaster oils as phytochemical and genetic resources to improve the quality of olive oil [37]. Another thorough study has suggested that the use of wild germplasm in olive breeding programs will not have a negative impact on olive oil composition in terms of fatty acids, tocopherol content and tocopherol and phytosterol profiles, provided that selection for these compounds is conducted starting from early generations [15]. Unfortunately, the potential of cuspidata olive drupes regarding their phytochemical composition has not yet been deciphered and their differences from cultivated olives have been scarcely studied.
Therefore, the objectives of the present work were: (i) to perform an in-depth characterization of the metabolic profile of cuspidata samples; (ii) to compare their compositional profiles regarding phenolic and triterpenic substances (qualitatively and quantitatively) with those of four common olive cultivars ("Arbequina", "Frantoio", "Koroneiki" and "Picual"); and (iii) to evaluate whether the subsp. cuspidata could represent a valid source of useful traits for cultivated olive, eventually proving the potential of this subspecies to be included in breeding programs.

Results and Discussion
2.1. Characterization of the Metabolic Profile of Progenies from Olea europaea subsp. cuspidata by LC-MS
As stated in the Materials and Methods (see Section 3), liquid chromatography (LC) coupled with high-resolution mass spectrometry (HRMS) was used to perform a qualitative profiling of the extracts of the subsp. cuspidata fruit samples. A total of 62 compounds were detected within the profiles; a combination of accurate mass and isotopic distribution was used to calculate the theoretical elemental formula of the detected metabolites. The identity of some compounds was verified by using the commercial or isolated pure standards available in-house; for some other metabolites, however, we just provided a tentative identification based on a combination of experimental data (HRMS data and in-source fragmentation patterns), the expertise of our research group and the information previously described in the literature regarding olive fruit characterization [28,30,31,33]. Table 2 shows the qualitative exploration of progenies from Olea europaea subsp. cuspidata. Each row of the table includes the identity assigned to each analyte, to which chemical class it might belong, its molecular formula, retention time, experimental and theoretical m/z signals, error (ppm) and mSigma value, as well as the in-source fragments detected in MS.
Secoiridoids (40) made up the most numerous group of compounds, followed by flavonoids (10), pentacyclic triterpenes (5), simple phenols or related analytes (3) and organic acids (2). It should be noted that a large part of the identified compounds corresponded to glycosylated derivatives and isomers, especially in the case of secoiridoids. As far as secoiridoids are concerned, 22 analytes were structurally related to hydroxytyrosol (oleuropein derivatives), 3 to tyrosol (ligstroside derivatives) and 12 were oleoside-type and elenolic acid derivatives. In many of the genotypes evaluated, the compounds annotated as oleuropein, verbascoside, elenolic acid glucoside (isomer C), demethyl oleuropein, lucidumoside C, ligstroside and oleoside/secologanoside (isomer C) were the peaks with the highest relative intensity in the profiles. Similarly, several oleuropein-, ligstroside- and elenolic acid-derived compounds, such as oleuropein aglycone isomers, demethyl ligstroside and acyclodihydroelenolic acid hexoside (isomer B), were found to be relevant in the chromatographic profile of the cuspidata samples. The presence of oleuropein and ligstroside aglycones in the drupes is the consequence of the overexpression of the β-glucosidase enzyme, which is involved in the ripening mechanism [28]. In this case, two isomers of oleuropein aglycone and some of its derivatives (dehydro oleuropein aglycone A and B, hydroxy decarboxymethyl oleuropein aglycone, and 10-hydroxy oleuropein aglycone A and B) were detected in cuspidata samples, while ligstroside aglycones were not detected.
The second most numerous group of compounds was flavonoids. In this category, we found the following substances: rutins A and B, luteolin 7-O-glucoside, luteolin rutinoside, luteolin glucoside isomers A, B and C, apigenin 7-O-glucoside, luteolin and apigenin. Luteolin 7-O-glucoside and luteolin glucoside isomer B (m/z 447.0937) and, in particular, rutin (m/z 609.1463) were the most abundant ones.
Within the category of pentacyclic triterpenes, five compounds were identified: maslinic acid (m/z 471.3479), betulinic acid (m/z 455.3529), oleanolic acid (m/z 455.3528), an isomer with m/z 455.3531 and a monohydroxylated derivative of maslinic acid. These compounds have been previously reported by other authors in olive fruit tissues of subspecies europaea [23,33]. Substances belonging to the chemical classes of simple phenols and organic acids were also found in the LC-MS profiles of cuspidata genotypes. Regarding simple phenols (or similar compounds), three compounds were identified: hydroxytyrosol glucoside (m/z 315.1081), oxidized hydroxytyrosol (m/z 151.0395) and phenylethyl primeveroside (m/z 415.1606). Organic acids were the most polar analytes of all those detected in the profiles, eluting at the beginning of the chromatogram. Within this category, quinic and citric acids were found in the samples. Only the former (m/z 191.0550) was remarkable due to its intensity in the profile.
Three other substances, which were found in the profiles with high relative intensities, could not be identified with confidence. The peak with m/z 537.1605 (C25H30O13) was tentatively assigned to fraxamoside, considering that its presence has been recently described in Greek olives by Kritikou and co-authors [40]. The MS/MS analysis described in their work agreed with some of our in-source fragments (m/z 323.0811 and 221.0273), which suggests that it could be the same compound they described. The second unknown peak was the one with m/z 363.1440, which could be a compound related to ligstroside aglycone (the predicted molecular formula was C19H24O7), and the third one was the peak with m/z 421.1494 and molecular formula C21H26O9. Our hypothesis regarding the latter one is that it could be a secoiridoid derivative (oleuropein aglycone + C2H4O or ligstroside aglycone acetate). Some experiments are already in progress to be able to assign an identity to them in the near future.
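The agreement between the predicted formulae and the measured signals can be checked with a short calculation. The sketch below is only illustrative: it assumes that the three reported signals correspond to singly charged [M−H]− ions, and the function names are hypothetical helpers, not part of any vendor software.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes of C, H and O.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503223, "O": 15.99491461957}
ELECTRON_MASS = 0.000548579909


def monoisotopic_mass(formula):
    """Sum the isotope masses of a simple formula such as 'C19H24O7'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def mz_deprotonated(formula):
    """Theoretical m/z of the singly charged [M-H]- ion (assumed ion type)."""
    return monoisotopic_mass(formula) - MONOISOTOPIC["H"] + ELECTRON_MASS


def ppm_error(measured, theoretical):
    """Relative mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6


# Predicted formulae and measured m/z values quoted in the text.
peaks = {"C25H30O13": 537.1605, "C19H24O7": 363.1440, "C21H26O9": 421.1494}

for formula, measured in peaks.items():
    theoretical = mz_deprotonated(formula)
    error = ppm_error(measured, theoretical)
    print(f"{formula}: theoretical m/z {theoretical:.4f}, error {error:+.1f} ppm")
```

Under the [M−H]− assumption, the three measured values fall within a few ppm of the theoretical ones, which is the kind of agreement expected when a formula is assigned from accurate-mass data.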

Application of LC-MS for the Quantitative Evaluation of Samples under Study
From the identified compounds, a total of 27 metabolites were quantitatively assessed in the samples under study by using LC coupled to low-resolution (LR) MS (Figure 1). The choice of the compounds to be quantified was mainly based on: (1) the compounds having a higher prevalence (in terms of area and intensity; i.e., they are more abundant) in the chromatographic profiles, and (2) having an appropriate pure standard to perform a proper quantification. We decided to quantify three flavonoids (luteolin glucoside (isomer B), luteolin 7-O-glucoside and rutin (isomer B)), one organic acid (quinic acid), three pentacyclic triterpenes (betulinic, oleanolic and maslinic acids), seventeen secoiridoids (caffeoyl 6-secologanoside, dihydro oleuropein, dehydro nuzhenide, β-hydroxy verbascoside, neonuzhenide, methoxy oleuropein (isomer A), oleuropein aglycone isomers A and B, demethyl ligstroside, acyclodihydroelenolic acid hexoside (B), oleoside/secologanoside (isomer C), ligstroside, lucidumoside C (isomer A), demethyl oleuropein, elenolic acid glucoside (isomer C), verbascoside and oleuropein), one simple phenol (hydroxytyrosol glucoside) and two unknown compounds, with m/z of 363 and 421, respectively. Most of the secoiridoids and the two unknown compounds were quantified in terms of oleuropein. β-Hydroxy verbascoside, verbascoside, caffeoyl 6-secologanoside and demethyl ligstroside were quantified by using the calibration curve obtained with the pure standard of verbascoside. This seemed appropriate because a relatively low area was found for the latter compounds in the samples under study. Hydroxytyrosol glucoside was quantified with the hydroxytyrosol standard, and luteolin glucoside in terms of its isomer luteolin 7-O-glucoside.
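Quantifying a compound "in terms of" a surrogate standard amounts to reading its peak area on the calibration line of that standard. The following sketch illustrates the idea with an ordinary least-squares fit; the standard concentrations, the peak areas and the function names are hypothetical and do not come from the data set of this work.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def concentration_from_area(area, slope, intercept):
    """Invert the calibration line to obtain a concentration from a peak area."""
    return (area - intercept) / slope


# Hypothetical oleuropein standards: concentration (mg/L) versus peak area.
std_conc = [1.0, 5.0, 10.0, 50.0, 100.0]
std_area = [2.1e4, 1.0e5, 2.0e5, 1.01e6, 1.99e6]
slope, intercept = fit_line(std_conc, std_area)

# Illustrative peak area of a secoiridoid lacking its own standard; the result
# is an "oleuropein-equivalent" concentration in the extract (mg/L).
equivalent = concentration_from_area(5.0e5, slope, intercept)
print(f"{equivalent:.1f} mg/L oleuropein equivalents")
```

Values obtained in this way are relative to the response of the surrogate standard, so they are best read as equivalents rather than absolute concentrations.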

Fruit Weight, Oil Content and Total Compounds of Wild and Cultivated Olives
As expected, a highly significant correlation was found between fruit weight and oil content (r = 0.80, p < 0.001), with most cuspidata genotypes and their female parent falling in the lower range of values for these two traits with respect to the four cultivars analyzed (Figure 2, upper left). The oil yield ranged from approximately 10 to 45% (on a fruit dry weight basis) in cuspidata fruit, although most genotypes exhibited values between 10 and 25%. Similar contents were reported by Joshi and Gulfraz et al., ranging from about 20 to 28% for Olea ferruginea Royle in the north-west of India and from 33 to 39% in Pakistan, respectively [41,42]. Another study described lower values of oil yield by mill extraction for Olea ferruginea Royle from Pakistan, within the range from 11.1 to 12.5% [34]. A few cuspidata genotypes showed values for these two traits close to the ones obtained for the cultivars, which indicates that potentially interesting values for these attributes can be recovered in a single generation.
The relationship between fruit weight or oil content and total metabolite content (right and lower parts of Figure 2) was not so clear, even though a significant negative correlation was observed in both cases. Similar results were also obtained for individual components and for the different chemical categories (data not shown). Higher contents of some other minor compounds such as tocopherols, associated with concomitant lower values for fruit size and oil content, have also been reported in non-cultivated olive plant materials [15]. Additionally, a comparable negative relationship is consistently found during fruit ripening, with phenolic and other minor components decreasing as fruit size and oil content increase [30,43].

Quantitative Evaluation of the Selected Individual Compounds and Principal Component Analysis to Explore the Natural Clustering of the Samples
Table 3 presents a summary of the quantitative data. The quantitative data for each and every compound quantified in the progeny, the female parent and the cultivars have been included in Table S1 (Supplementary Material). Two independent replicates of each cuspidata genotype and cultivar samples (n = 28 × 2 (cuspidata) and n = 4 × 2 (cultivars), respectively), injected twice, were used to obtain the final quantitative values.
As observed in Table 3, most of the 27 compounds selected to be quantified were determined in all the genotypes of the cuspidata progeny, with the exception of methoxy oleuropein, demethyl ligstroside, ligstroside and demethyl oleuropein, which were quantified in 27 samples; luteolin 7-O-glucoside and β-hydroxy verbascoside, which were determined in 26 samples; and verbascoside, which was only quantified in 18 wild olive fruit extracts. Metabolites that were not found in all samples of O. europaea subsp. europaea were β-hydroxy verbascoside, verbascoside and the unknown compound with m/z 363, quantified in three of the four cultivars; methoxy oleuropein (A) and demethyl oleuropein, determined in two cultivars ("Arbequina" and "Frantoio"); and neonuzhenide and demethyl ligstroside, which were only quantified in "Frantoio".
The main differences between the pulp of cuspidata and europaea samples appeared to be associated with flavonoids, particularly rutin. It was the most abundant flavonoid in both types of samples, but its concentration in cuspidata was five times higher than in the cultivars. The organic acids and pentacyclic triterpenes exhibited similar concentrations in the two types of samples and simple phenols were higher in cuspidata pulp, but not by much. Although, in the secoiridoid family, verbascoside and oleuropein were the predominant metabolites for both progeny and conventional olives, some differences were observed.
Substances such as demethyl ligstroside, oleoside/secologanoside (C), ligstroside, lucidumoside C (A), demethyl oleuropein and elenolic acid glucoside (C) were consistently more abundant in wild olives, whereas, for instance, dihydro oleuropein, oleuropein aglycone (isomers A and B) and acyclodihydroelenolic acid hexoside (B) were, on average, more abundant in cultivars. Figure 3 shows the quantitative distribution of some compounds in the samples of progeny, the female parent of open pollination progeny and the cultivars. In all cases, the x-axis shows the concentration in g·kg⁻¹ and the y-axis the frequency (the number of samples that exhibited concentrations in a given range); letters (to facilitate interpretation) indicate in which group the female parent or the different cultivars fell. The range of variability for the cuspidata progeny markedly expands the value of their corresponding female parents for total metabolite contents, achieving, therefore, a huge improvement in one single generation. Cultivars showed intermediate ranges of total metabolite concentration, while the highest values were found for some cuspidata samples. When the oleuropein histogram was studied in detail, it was noted that most of the wild progeny clustered together with their female parents in the lowest range of concentration, although some exceptional genotypes showing high oleuropein contents were also obtained among the cuspidata progeny.
Table 3. Summary of the quantitative data obtained for the metabolites quantified in the cuspidata progeny and female parent, and the cultivars ("Arbequina", "Frantoio", "Koroneiki" and "Picual"). The compounds are ordered in the table by chemical classes and increasing concentrations in the progeny. The N column indicates the number of times each compound was quantified in each group.
Oleuropein aglycone (isomers A and B) prevailed in cultivars, especially in "Picual" and "Frantoio"; although there were some cuspidata samples that were richer than the cultivars, the most common situation was that the greatest number of genotypes fell into the lower concentration ranges (together with the female parent). Rutin showed a histogram quite different from those just discussed. "Arbequina", "Frantoio", "Koroneiki" and "Picual" exhibited the lowest concentrations; the female parent of the progeny, however, showed concentrations at least three times higher than those of the cultivars. Eleven genotypes had rutin levels equal to or higher than those of the female parent (reaching values of up to 14.1 g·kg⁻¹) and all were substantially richer than the cultivars. The quinic acid content of the progeny appeared to be comparable to that of the cultivars, ranging from 10.3 to 14.3 g·kg⁻¹ for 18 of the genotypes studied.
Some of the details just mentioned were also revealed in the principal component analysis (PCA), which was used to perform a preliminary exploratory analysis of the variability between and within the groups of samples evaluated (Figure 4). The PCA score plots obtained using the entire LC-MS quantitative data set are displayed in a two-dimensional plot using the first two principal components (left of Figure 4), which covered 24.0% and 15.0% of the total variance, respectively. The graph shows a quite clear separation among the cultivars and the genotypes from the progeny, although the cuspidata samples were spread over several areas of the plot, which would mean that a relatively wide range of variability was observed over the entire progeny. The right part of Figure 4 shows the loading plots of the PCA model. The meaning of the numbers assigned to each compound is shown in the figure caption; these were assigned considering the relative abundance (in decreasing order) in the progeny samples.

PC1 correlated positively, mainly, with hydroxytyrosol glucoside, unknown 1 (m/z 363), neonuzhenide, luteolin glucoside (B) and betulinic acid, and negatively with oleuropein and verbascoside. PC2 was positively related to oleanolic acid, acyclodihydroelenolic acid hexoside (B) and caffeoyl 6-secologanoside, among other compounds.
In view of the loading plot, it can be stated that five secoiridoids are able to define a fairly typical pattern for samples of O. europaea subsp. europaea, with these substances being the following: dihydro oleuropein, acyclodihydroelenolic acid hexoside (B), caffeoyl 6-secologanoside and isomers A and B of oleuropein aglycone.
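For readers less familiar with PCA, the way scores, loadings and explained variance arise from a quantitative data matrix can be sketched as follows. This is a minimal pure-Python illustration, not the software used in this work: it assumes autoscaling (unit variance per compound), which is not stated in the text, it extracts components by power iteration with deflation, and the concentration values are hypothetical.

```python
import math


def autoscale(rows):
    """Center each column and scale it to unit sample variance (assumed pretreatment)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def covariance(rows):
    """Covariance matrix of already centered data."""
    n, p = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(matrix, iterations=1000):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    p = len(matrix)
    vec = [1.0 / (k + 1) for k in range(p)]
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    value = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(p))
                for i in range(p))
    return value, vec


def pca(rows, n_components=2):
    """Return explained variance ratio, loadings and scores per component."""
    scaled = autoscale(rows)
    cov = covariance(scaled)
    p = len(cov)
    total_variance = sum(cov[i][i] for i in range(p))
    components = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        scores = [sum(r[j] * vec[j] for j in range(p)) for r in scaled]
        components.append({"explained": value / total_variance,
                           "loadings": vec, "scores": scores})
        # Deflation: remove the variance already captured by this component.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(p)]
               for i in range(p)]
    return components


# Hypothetical concentrations (g/kg): rows are samples, columns are
# rutin, oleuropein and oleuropein aglycone.
samples = [
    [14.1, 2.0, 0.5],
    [9.8, 3.1, 0.7],
    [11.2, 2.4, 0.4],
    [2.1, 8.9, 1.9],
    [1.8, 10.2, 2.3],
    [2.5, 9.5, 2.0],
]

components = pca(samples)
for index, comp in enumerate(components, start=1):
    print(f"PC{index}: {100 * comp['explained']:.1f}% of the total variance")
```

In such a model, compounds whose loadings have the same sign on a component vary together across samples, while opposite signs indicate that one tends to be high where the other is low, which is how the loading plot is read above.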

Quantitative Results Structured by Chemical Classes
In this section, we intend to discuss the results considering the different families of metabolites that were determined, i.e., flavonoids, organic acids, triterpenes, secoiridoids, simple phenols and unknowns. For this purpose, Figure 5 shows a graph describing the compositional pattern of each sample according to the percentage that each family of compounds represents with respect to the total concentration of metabolites (with all values normalized to the maximum metabolite concentration found for each sample).
"Koroneiki", "Frantoio" and "Picual" seem to have a percentage distribution of the different chemical classes quite comparable to each other. "Arbequina" differed from the other cultivars in its compositional distribution, showing the highest percentage of simple phenols (5.3%), pentacyclic triterpenes (43.3%) and flavonoids (8.4%).
The relative abundance of the different families of compounds in the female parent is not comparable with that of any of the cultivars. A high proportion of flavonoids (6.1-22.4%) was observed in the progeny, although none of the evaluated cuspidata genotypes exceeded the percentage of flavonoids found in the female parent (24.4%). Simple phenols and pentacyclic triterpenes ranged from 0.7 to 15.4% and 11.2 to 37.2%, respectively, in the wild olive extracts. Organic acids and secoiridoids showed the highest overall and maximum percentages in the genotypes of the progeny, ranging from 10.2 to 40.0% and 13.0 to 64.5%, respectively. It would be possible to establish a hypothetical correlation between these two families, since, in general, the lower the percentage of quinic acid in a sample, the higher the percentage of secoiridoids found.
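The percentage that each chemical family represents in a sample is simply the sum of the concentrations of its members divided by the total. A minimal sketch of this bookkeeping is given below; the concentrations are hypothetical and the function name is only illustrative.

```python
def class_percentages(concentrations, family_of):
    """Percentage of the total metabolite content contributed by each family."""
    totals = {}
    for compound, value in concentrations.items():
        family = family_of[compound]
        totals[family] = totals.get(family, 0.0) + value
    grand_total = sum(totals.values())
    return {family: 100.0 * value / grand_total
            for family, value in totals.items()}


# Assignment of a few quantified compounds to their chemical families.
family_of = {
    "rutin": "flavonoids",
    "luteolin 7-O-glucoside": "flavonoids",
    "quinic acid": "organic acids",
    "maslinic acid": "pentacyclic triterpenes",
    "oleuropein": "secoiridoids",
    "ligstroside": "secoiridoids",
    "hydroxytyrosol glucoside": "simple phenols",
}

# Hypothetical concentrations (g/kg) for a single sample.
sample = {
    "rutin": 6.0,
    "luteolin 7-O-glucoside": 2.0,
    "quinic acid": 12.0,
    "maslinic acid": 10.0,
    "oleuropein": 15.0,
    "ligstroside": 3.0,
    "hydroxytyrosol glucoside": 2.0,
}

percentages = class_percentages(sample, family_of)
for family, share in sorted(percentages.items()):
    print(f"{family}: {share:.1f}%")
```

Because the shares of a sample always add up to 100%, an increase in one family necessarily lowers the others, which should be kept in mind when relating the percentages of quinic acid and secoiridoids.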

Preliminary Exploration of Metabolic Pathways: Cross-Correlation of the Secondary Metabolites Determined in the Progeny
The metabolic biosynthesis pathways in olive matrices are exceptionally complex. A great diversity of structures and dynamic transformations of compounds is found during development, ripening, harvesting or olive oil extraction. Different pathways, including the shikimate, phenylpropanoid, mevalonate and flavonoid pathways, have been described as the basis for producing several precursors of phenolic compounds. Briefly, the shikimate pathway consists of the condensation of phosphoenolpyruvic acid and erythrose-4-phosphate to synthesize 3-dehydroquinic acid, which is transformed into shikimic acid. The final metabolite, known as chorismic acid, is synthesized in subsequent reactions and is a key branch point for the formation of L-phenylalanine, which is the substrate of the phenylpropanoid and flavonoid pathways [44]. Secoiridoids, the main iridoids found in Oleaceae, are biosynthesized by the mevalonate pathway from deoxyloganic acid. The connection of secoiridoids to the shikimate pathway is provided by two simple phenols (tyrosol and hydroxytyrosol) synthesized in the phenylpropanoid pathway [45][46][47]. For example, oleuropein is synthesized from hydroxytyrosol, which in turn is also related to ligstroside.
A cross-correlation for the metabolites determined in the progeny is shown in Table 4; it was carried out to evaluate whether a certain metabolic relationship could be established between some of the compounds under study in the present investigation. This table shows a significant positive correlation (p < 0.001) between luteolin glucoside and luteolin 7-O-glucoside, as well as between oleuropein aglycone isomers A and B. A balance in the synthesis of isomeric compounds could be the most plausible reason for these correlations. Likewise, a dynamic interconversion between some secoiridoids could be the cause of the significant positive correlation (p < 0.001) highlighted for some compounds in the cross-correlation table. The correlation noted in this table between demethyl oleuropein and the unknown m/z 363 leads us to think that this compound could be a secoiridoid. Since its predicted molecular formula is C19H24O7, we hypothesize that it is a substance possibly related to ligstroside aglycone (perhaps with one less double bond).
In addition, quinic acid showed strong and inverse correlations (p < 0.001) with oleuropein and lucidumoside C (isomer A). Both quinic and shikimic acids have been described as precursors in the biosynthesis of several aromatic natural products in the shikimate pathway [48]. In this pathway, a reversible reduction of 3-dehydroquinic acid by quinic acid dehydrogenase occurs to produce quinic acid as a secondary metabolite [44]. Thus, a high content of quinic acid in olive fruit would lead to a lower amount of chorismic acid and L-phenylalanine and, consequently, a lesser amount of hydroxytyrosol. The biosynthesis of secoiridoids is interrelated with simple phenols, such as hydroxytyrosol, and their low availability could lead to a reduced formation of oleuropein and lucidumoside C. All these significant correlations among metabolites could be also inferred from the previously shown loading plot of PCA. Subsequent studies should, however, test this hypothesis.
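A pairwise cross-correlation of this kind can be sketched as shown below. The example assumes Pearson's coefficient, which the text does not specify, and uses hypothetical concentrations; the t statistic is the usual one for testing a zero correlation with n − 2 degrees of freedom, and its conversion to a p-value (left out here) would require a Student's t distribution.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r, n):
    """t value for the null hypothesis of zero correlation (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def cross_correlation(table):
    """Pearson r for every pair of compounds in a {compound: values} table."""
    return {(a, b): pearson_r(table[a], table[b])
            for a, b in combinations(table, 2)}


# Hypothetical concentrations (g/kg) of three metabolites in eight genotypes.
table = {
    "quinic acid": [14.0, 12.5, 13.1, 10.2, 9.8, 11.0, 8.5, 7.9],
    "oleuropein": [3.1, 4.8, 4.0, 7.5, 8.2, 6.1, 9.9, 10.4],
    "lucidumoside C": [1.0, 1.6, 1.2, 2.4, 2.9, 2.0, 3.1, 3.5],
}

for (a, b), r in cross_correlation(table).items():
    n = len(table[a])
    print(f"{a} vs {b}: r = {r:+.2f}, t = {t_statistic(r, n):+.1f}")
```

A significant correlation obtained in this way indicates only that two metabolites covary across genotypes; on its own it does not establish the precursor and product relationship proposed in the hypothesis above.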

Plant Materials
The plant materials included olive fruits from 27 cuspidata genotypes coming from free pollination and their corresponding female parent. In addition, fruit samples from the cultivars "Arbequina", "Frantoio", "Picual" and "Koroneiki" were also included in the experiment for comparison. The genotype acting as the female parent belongs to the wild olive Germplasm Bank preserved at the Institute of Agricultural and Fishery Research and Training, Córdoba, Spain [49]. Fruit samples (around 1 kg) were randomly collected for each plant on a common date (mid-October).

Chemicals and Reagents
All the reagents were of analytical or LC-MS grade and were used as received in the laboratory. Ethanol (EtOH) (in aqueous mixtures) was the solvent used for metabolite extraction and was supplied by Prolabo (Paris, France). Mobile phases were prepared using doubly deionized water with a resistivity of 18.2 MΩ·cm obtained by using a Milli-Q system (Millipore, Bedford, MA, USA) (phase A) and LC-MS-grade acetonitrile (ACN) from Prolabo (Paris, France) (phase B), acidified with acetic acid (AcH) supplied by Sigma-Aldrich (St. Louis, MO, USA). Pure standards of organic acids (quinic acid), phenolic compounds (vanillin, p-coumaric acid, ferulic acid, hydroxytyrosol, tyrosol, rutin, oleuropein, luteolin, luteolin 7-O-glucoside, verbascoside, apigenin, apigenin 7-O-glucoside and pinoresinol) and pentacyclic triterpenes (maslinic, betulinic and oleanolic acids, erythrodiol and uvaol) were acquired from Sigma-Aldrich (St. Louis, MO, USA). A stock solution was prepared by dissolving an appropriate amount of each metabolite in EtOH/H2O (80:20, v/v) and then different dilutions were prepared to obtain diverse concentration ranges for each individual compound. All the sample extracts and standard solutions were filtered through Clarinet™ 0.22 µm nylon syringe filters acquired from Bonna-Agela Technologies (Wilmington, DE, USA). Mobile phases were filtered through a Nylaflo™ 0.45 µm nylon membrane filter supplied by Pall Corporation (Michigan, MI, USA). All the solutions were stored in dark flasks at −23 °C.

Fruit Weight and Oil Content
From each sample, three subsamples of around 25 g were randomly selected so that the dried material would fit the NMR sample holder. Fruit fresh weight was measured and, after drying in a forced-air oven at 105 °C for 42 h to ensure dehydration, oil content was determined using an NMR fat analyzer (Minispec MQone, Bruker Optik GmbH, Ettlingen, Germany) and expressed as a percentage on a dry weight basis [50].

Extraction and LC-MS Analysis of Fruit Metabolites
A representative sample of 50 fruits was destoned, and the pulp was chopped, lyophilized, crushed to a fine powder and frozen at −23 °C. The metabolite-extraction procedure was the one previously reported by Olmo-García and colleagues [33,51], with a few modifications. Briefly, sample extracts were prepared by mixing 0.2 g of freeze-dried and homogenized pulp with 10 mL of EtOH/H2O (60:40, v/v) in a 15 mL Falcon tube. After 1 min of vortex shaking, the tube was placed in an ultrasound bath for 30 min and centrifuged for 5 min at 8000 rpm. Once the two phases were separated, the supernatant was transferred to a flask. The pellet was re-extracted twice by adding 10 mL of EtOH/H2O (80:20, v/v), applying, in both cases, the same procedure as in the first extraction. The use of EtOH/H2O mixtures in varying proportions ensured the effective extraction of the compounds of interest belonging to different chemical classes. All the supernatants (from the three extraction cycles) were combined, and about 1 mL of the sample extract was filtered through a 0.22 µm nylon syringe filter and placed in an HPLC vial.
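The masses and volumes given above fix the dilution factor between the pulp and the pooled extract. The following Python sketch shows how a concentration measured in the extract could be converted into a content per kilogram of freeze-dried pulp. It assumes that the three 10 mL supernatants are recovered without loss and that results are expressed in mg/kg of dry pulp; neither assumption is stated in the text, and the function names are illustrative.

```python
# Sketch: convert an extract concentration (mg/L) into a content per kg of
# freeze-dried pulp, using the mass and volumes stated in the extraction step.

PULP_MASS_G = 0.2                            # freeze-dried pulp per extraction (from the text)
EXTRACTION_VOLUMES_ML = (10.0, 10.0, 10.0)   # three cycles of 10 mL each (from the text)


def total_extract_volume_ml(volumes_ml=EXTRACTION_VOLUMES_ML):
    """Pooled supernatant volume, ASSUMING no solvent is lost or retained."""
    return sum(volumes_ml)


def content_mg_per_kg(conc_mg_per_l, pulp_mass_g=PULP_MASS_G, volume_ml=None):
    """Analyte content (mg per kg of dry pulp) for a given extract concentration."""
    if volume_ml is None:
        volume_ml = total_extract_volume_ml()
    analyte_mg = conc_mg_per_l * (volume_ml / 1000.0)   # mg present in the pooled extract
    return analyte_mg / (pulp_mass_g / 1000.0)          # divide by pulp mass in kg
```

Under these assumptions, each mg/L measured in the extract corresponds to 150 mg/kg of freeze-dried pulp (30 mL of extract per 0.2 g of pulp).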
Two different LC-MS platforms were used in this study. The LC-MS system with a high-resolution MS (HRMS) analyzer was used for qualitative purposes, whereas the LC platform coupled to a low-resolution MS (LR-MS) was used to quantify the analytes of interest. For qualitative purposes, the LC-MS platform consisted of an Acquity UPLC™ H-Class system coupled to a quadrupole-time-of-flight (QTOF) SYNAPT G2 MS (Waters, Manchester, UK). This instrument provided accurate mass and isotopic pattern data, which allowed us to predict the molecular formulae of the detected compounds and greatly facilitated compound annotation and the identification of unknown peaks in complex matrices. Thus, the HRMS analyses helped us to describe the qualitative profiles of the samples under study. Afterwards, quantitative analyses were performed on a 1260 Infinity LC system (Agilent Technologies, Waldbronn, Germany) coupled to an Esquire 2000 ion trap (IT) mass spectrometer (Bruker Daltonics, Bremen, Germany), which allowed us to quantify the targeted compounds by using standard calibration curves. Both MS instruments were equipped with an electrospray (ESI) interface. The selected column was an analytical Zorbax Extend C18 column (4.6 × 100 mm; 1.8 µm particle size) working at 40 °C. Water with 1% AcH (v/v) (phase A) and ACN with 1% AcH (v/v) (phase B) were used as mobile phases. A solvent gradient was applied for the separation of analytes, and the mobile phase composition changed as follows: 0 min, 90% A and 10% B; 10 min, 75% A and 25% B; 12 min, 40% A and 60% B; 14 min, 20% A and 80% B; 18 min, 0% A and 100% B. At 20 min, the system returned to the initial conditions and the column was re-equilibrated for 3 min. The flow rate was kept constant at 1 mL/min and the injection volume was set at 10 µL. The IT MS data were acquired in full-scan mode over a mass range from 50 to 1000 m/z, and the system was operated in negative polarity mode.
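The gradient program listed above can be written as a small lookup table. The sketch below returns the mobile-phase composition at any time between 0 and 18 min. It assumes linear ramps between the listed time points, which is the usual behavior of LC pumps but is not stated explicitly in the text; the interval between 18 and 20 min is left out because the text does not describe it. The names `GRADIENT`, `percent_b` and `percent_a` are illustrative.

```python
# Sketch: mobile-phase composition during the gradient, ASSUMING linear ramps
# between the programmed time points (time in min, % of phase B).

GRADIENT = [(0.0, 10.0), (10.0, 25.0), (12.0, 60.0), (14.0, 80.0), (18.0, 100.0)]


def percent_b(t_min, table=GRADIENT):
    """Percentage of phase B (ACN with 1% AcH) at time t_min."""
    if t_min < table[0][0] or t_min > table[-1][0]:
        raise ValueError("time outside the programmed gradient (0 to 18 min)")
    # Find the segment that contains t_min and interpolate linearly within it.
    for (t0, b0), (t1, b1) in zip(table, table[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


def percent_a(t_min, table=GRADIENT):
    """Percentage of phase A (water with 1% AcH); A and B always sum to 100%."""
    return 100.0 - percent_b(t_min, table)
```

For instance, `percent_b(5.0)` gives the composition halfway through the first, shallow ramp, where the more polar compounds would be expected to elute.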
As far as the ESI source is concerned, the operating parameters were as follows: the nebulizer gas (nitrogen) was set at 30 psi, the dry gas flow rate was fixed at 9 L/min and dry gas temperature at 300 °C, the capillary voltage was set at +3200 V and the end-plate offset at −500 V. For HRMS analyses, these parameters were transferred to the ESI-QTOF MS system.
To operate the LC and the LR-MS systems, the Agilent ChemStation (Agilent Technologies) and Esquire Control (Bruker Daltonics) were used, respectively. The HRMS platform was controlled by means of MassLynx (Waters). The data processing was performed by using DataAnalysis v 4.0 software (Bruker Daltonics, Bremen, Germany) and Microsoft Excel v 2204.
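Quantitation with standard calibration curves, as mentioned above, amounts to fitting the detector response of the pure standards against their concentrations and inverting the fit for each sample peak. A minimal sketch follows, assuming an ordinary least-squares straight line; the text does not specify the regression model or any weighting, and the function names are illustrative.

```python
# Sketch: external-standard calibration, ASSUMING a linear, unweighted
# least-squares fit of peak area against standard concentration.


def fit_calibration(concentrations, peak_areas):
    """Return (slope, intercept) of the least-squares line area = slope*conc + intercept."""
    n = len(concentrations)
    if n < 2 or n != len(peak_areas):
        raise ValueError("need at least two matched calibration points")
    mean_x = sum(concentrations) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, peak_areas))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantify(peak_area, slope, intercept):
    """Invert the calibration line to estimate the concentration in an extract."""
    return (peak_area - intercept) / slope
```

In practice one curve would be fitted per quantified compound, and sample peak areas falling outside the calibrated range would call for dilution or an extended curve rather than extrapolation.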

Statistical Analysis
The variability for the metabolites quantified in the cuspidata progeny and the cultivars ("Arbequina", "Frantoio", "Koroneiki" and "Picual") was studied. Correlations between fruit weight, oil content and total metabolite content as well as the cross-correlation for the metabolites quantified in the progeny were evaluated. Principal component analysis (PCA) was performed to test the relations among the different phenolic and triterpenic compounds as well as samples' grouping by genotype. Statistix (Analytical Software, Tallahassee, FL, USA) and Unscrambler (CAMO A/S, Trondheim, Norway) were used for the statistical analysis.
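The analyses in this study were run in Statistix and Unscrambler. Purely as an illustration of the two techniques named above, the following standard-library Python sketch computes a Pearson cross-correlation matrix between variables and extracts the first principal component of that matrix by power iteration. Running PCA on the correlation matrix (that is, on autoscaled data) is an assumption, since the text does not state the preprocessing, and all function names are illustrative.

```python
import math

# Sketch: Pearson cross-correlation matrix and first principal component.
# Each inner list holds one variable (e.g. one metabolite) measured on all samples.


def standardize(columns):
    """Autoscale each variable to zero mean and unit (sample) standard deviation."""
    scaled = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        scaled.append([(v - mean) / sd for v in col])
    return scaled


def correlation_matrix(columns):
    """Pearson correlation between every pair of variables."""
    z = standardize(columns)
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def first_principal_component(corr, iterations=200):
    """Dominant eigenvector (loadings) and eigenvalue of corr by power iteration.

    The uneven start vector avoids the trivial case of starting exactly
    orthogonal to the dominant eigenvector for simple symmetric inputs.
    """
    p = len(corr)
    v = [1.0 + i for i in range(p)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue
```

Dividing the returned eigenvalue by the number of variables gives the fraction of total variance explained by the first component; further components would require deflation or a full eigendecomposition, which a dedicated statistics package handles more robustly.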

Conclusions
This contribution presents the first in-depth qualitative and quantitative characterization of fruit samples from Olea europaea subsp. cuspidata. By means of a powerful LC-MS method, about 60 compounds were identified and the most representative ones were quantified. The metabolic profiles of a progeny obtained through the open pollination of cuspidata were compared with those of a set of cultivars, showing that the genotypes from the progeny were, overall, richer in bioactive compounds than the cultivars, particularly in terms of the concentrations of rutin, hydroxytyrosol glucoside, several interesting secoiridoids and the compounds of m/z 421 and 363. These results suggest that including cuspidata in breeding programs could be valuable for the introgression of potentially interesting compounds. Studies such as this one make it possible to take advantage of the potential of food metabolomics for the identification and maintenance of olive genetic diversity.